Conditionally linear Gaussian models for estimating vocal tract resonances

Authors

Daniel Rudoy

Daniel N. Spendley

Patrick J. Wolfe

Abstract

Vocal tract resonances play a central role in the perception and analysis of speech. Here we consider the canonical task of estimating such resonances from an observed acoustic waveform, and formulate it as a statistical model-based tracking problem. In this vein, Deng and colleagues recently showed that a robust linearization of the formant-to-cepstrum map enables the effective use of a Kalman filtering framework. We extend this model both to account for the uncertainty of speech presence, by way of a censored likelihood formulation, and to explicitly model formant cross-correlation via a vector autoregression; in doing so we retain a conditionally linear and Gaussian framework amenable to efficient estimation schemes. We provide evaluations using a recently introduced public database of formant trajectories, for which results indicate improvements of 20% to over 30% per formant in root mean square error, relative to a contemporary benchmark formant analysis tool.
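As a rough illustration of the kind of tracker the abstract describes, the sketch below runs one Kalman filter step with first-order vector autoregressive dynamics and skips the measurement update on frames flagged as non-speech. It is not the authors' implementation. The function name `kalman_step`, the matrices `A`, `Q`, `H`, `R`, and the choice to treat a censored frame as "no update" are all assumptions made for this example; in particular, `H` stands in for a linearized formant-to-cepstrum map whose actual form is not given here.

```python
import numpy as np

def kalman_step(x, P, y, speech_present, A, Q, H, R):
    """One predict/update step of a linear Gaussian tracker (illustrative only)."""
    # Predict with assumed VAR(1) dynamics: x_t = A x_{t-1} + w_t, w_t ~ N(0, Q).
    # Off-diagonal entries of A let one formant's past value inform another's.
    x_pred = A @ x
    P_pred = A @ P @ A.T + Q
    if not speech_present:
        # Assumed handling of a censored frame: keep the prediction and
        # do not use the observation at all.
        return x_pred, P_pred
    # Update with an assumed linearized observation: y_t = H x_t + v_t, v_t ~ N(0, R).
    S = H @ P_pred @ H.T + R                  # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)       # Kalman gain
    x_new = x_pred + K @ (y - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

# Toy two-formant example with made-up numbers (Hz).
A = np.array([[0.9, 0.05], [0.05, 0.9]])
Q = 0.1 * np.eye(2)
H = np.eye(2)
R = 0.5 * np.eye(2)
x0 = np.array([500.0, 1500.0])
P0 = np.eye(2)

x_sil, P_sil = kalman_step(x0, P0, None, False, A, Q, H, R)
x_sp, P_sp = kalman_step(x0, P0, np.array([520.0, 1480.0]), True, A, Q, H, R)
```

Conditional on the speech-presence flag, every step is linear and Gaussian, which is what keeps the standard Kalman recursions applicable in this sketch.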


Similar resources

On instantaneous vocal tract length estimation from formant frequencies

The length of the vocal tract and its relationship with formant frequencies is examined at fine temporal scales with the goal of providing accurate estimates of vocal tract length from acoustics on a spectrum-by-spectrum basis despite unknown articulatory information. Accurate vocal tract length estimation is motivated by applications to speaker normalization and biometrics. Analyses presented ...


Conditional Dependence in Longitudinal Data Analysis

Mixed models are widely used to analyze longitudinal data. In their conventional formulation as linear mixed models (LMMs) and generalized LMMs (GLMMs), a commonly indispensable assumption in settings involving longitudinal non-Gaussian data is that the longitudinal observations from subjects are conditionally independent, given subject-specific random effects. Although conventional Gaussian...


Continuous Voice Morphing Using Separated Vocal Tract Area Functions and Glottal Source Waves

This paper presents a flexible voice morphing method, which is based on a conversion using a linear combination of the vocal tract area functions estimated from speech signals. The method focuses on the continuity of the phonological identity of the overall interpolated area. The main features of the method are 1) to separate characteristics of the vocal tract resonances from those of glottal s...


Robust Emotion Recognition using Pitch Synchronous and Sub-syllabic Spectral Features

This chapter discusses the use of vocal tract information for recognizing the emotions. Linear prediction cepstral coefficients (LPCC) and mel frequency cepstral coefficients (MFCC) are used as the correlates of vocal tract information. In addition to LPCCs and MFCCs, formant related features are also explored in this work for recognizing emotions from speech. Extraction of the above mentioned ...


A novel instrument to measure acoustic resonances of the vocal tract during phonation

Acoustic resonances of the vocal tract give rise to formants (broad bands of acoustic power) in the speech signal when the vocal tract is excited by a periodic signal from the vocal folds. This paper reports a novel instrument which uses a real-time, non-invasive technique to measure these resonances accurately during phonation. A broadband acoustic current source is located just outside the mo...




Publication date: 2007
